Importance of tonal envelope cues in Chinese speech recognition.

Authors

  • Q J Fu
  • F G Zeng
  • R V Shannon
  • S D Soli
Abstract

Recent studies have shown that temporal waveform envelope cues can provide significant information for English speech recognition. This study investigated the use of temporal envelope cues in a tonal language: Mandarin Chinese. Speech was divided into several frequency analysis bands; the amplitude envelope was extracted from each band by half-wave rectification and low-pass filtering and was then used to modulate a noise of the same bandwidth as the analysis band. These manipulations preserved temporal and amplitude cues in each frequency band, but removed the spectral detail within each band. Chinese vowels, consonants, tones, and sentences were identified by 12 native Chinese-speaking listeners with 1, 2, 3, and 4 noise bands. The results showed that recognition scores for vowels, consonants, and sentences increased monotonically with the number of bands, a pattern similar to that observed in English speech recognition. In contrast, tones were consistently recognized at about the 80% correct level, independent of the number of bands. This high level of tone recognition produced a significant difference in open-set sentence recognition between Chinese (11.0%) and English (2.9%) for the one-band condition, where no spectral information was available. The data also revealed that, with primarily temporal cues, the falling-rising tone (tone 3) and the falling tone (tone 4) were more easily recognized than the flat tone (tone 1) and the rising tone (tone 2). This differential pattern in tone recognition resulted in a similar pattern in word recognition: words carrying tone 3 or tone 4 were more likely to be recognized than words carrying tone 1 or tone 2. The quantitative contribution of tones to Chinese speech recognition was further explored using a power-function model, and tones were found to play a significant role in relating phoneme recognition to sentence recognition.
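The processing chain described in the abstract (band splitting, half-wave rectification, low-pass filtering, and modulation of band-limited noise) can be illustrated with a minimal sketch. This is not the authors' implementation: the second-order filters, the 50 Hz envelope cutoff, the band edges, and every function name below are assumptions chosen only to keep the example short and dependency-free.

```python
import math
import random


def biquad(samples, b, a):
    """Apply a second-order IIR filter (direct form I); a[0] is assumed to be 1."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def bandpass_coeffs(lo, hi, fs):
    """Second-order band-pass centred on the geometric mean of lo and hi (Hz)."""
    f0 = math.sqrt(lo * hi)
    q = f0 / (hi - lo)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def lowpass_coeffs(fc, fs):
    """Second-order low-pass with Q = 1/sqrt(2) and cutoff fc (Hz)."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * math.sqrt(0.5))
    c = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - c) / 2.0 / a0, (1.0 - c) / a0, (1.0 - c) / 2.0 / a0]
    a = [1.0, -2.0 * c / a0, (1.0 - alpha) / a0]
    return b, a


def noise_vocode(speech, fs, band_edges, env_cutoff=50.0, seed=0):
    """Replace the spectral detail in each band with envelope-modulated noise.

    band_edges lists N+1 frequencies (Hz) defining N adjacent bands.
    env_cutoff is an assumed value; the abstract does not state the cutoff.
    """
    rng = random.Random(seed)
    lp_b, lp_a = lowpass_coeffs(env_cutoff, fs)
    out = [0.0] * len(speech)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        bp_b, bp_a = bandpass_coeffs(lo, hi, fs)
        band = biquad(speech, bp_b, bp_a)
        # Half-wave rectification followed by low-pass filtering gives the envelope.
        rectified = [max(v, 0.0) for v in band]
        envelope = biquad(rectified, lp_b, lp_a)
        # White noise passed through the same band-pass has the band's bandwidth.
        white = [rng.uniform(-1.0, 1.0) for _ in speech]
        carrier = biquad(white, bp_b, bp_a)
        for i, (e, n) in enumerate(zip(envelope, carrier)):
            # Clamp small negative values caused by low-pass filter overshoot.
            out[i] += max(e, 0.0) * n
    return out
```

Calling `noise_vocode(speech, fs, [100.0, 1500.0, 4000.0])` would give a two-band condition; passing only two edges gives the one-band condition, in which only the broadband temporal envelope survives. A faithful replication would need much steeper filters and the band edges used in the study, neither of which is given in the abstract.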

Related resources

Tonal Languages and Cochlear Implants

As a major part of world languages, tonal languages are spoken in every continent except for Australia. In a tonal language, voice pitch variation (i.e., tone) at the monosyllabic level is a segmental structure that conveys lexical meaning of a word (Duanmu 2000). Mandarin Chinese, a tonal language, is spoken by more people than any other single language, including non-tonal languages. While so...

Features of stimulation affecting tonal-speech perception: implications for cochlear prostheses.

Tone languages differ from English in that the pitch pattern of a single-syllable word conveys lexical meaning. In the present study, dependence of tonal-speech perception on features of the stimulation was examined using an acoustic simulation of a CIS-type speech-processing strategy for cochlear prostheses. Contributions of spectral features of the speech signals were assessed by varying the ...

Temporal and spectral cues in Mandarin tone recognition.

This study evaluates the relative contributions of envelope and fine structure cues in both temporal and spectral domains to Mandarin tone recognition in quiet and in noise. Four sets of stimuli were created. Noise-excited vocoder speech was used to evaluate the temporal envelope. Frequency modulation was then added to evaluate the temporal fine structure. Whispered speech was used to evaluate ...

Investigation of the relative perceptual importance of temporal envelope and temporal fine structure between tonal and non-tonal languages

In this paper, we investigate the relative perceptual importance of the temporal envelope (TE) and temporal fine structure (TFS) between tonal language and non-tonal language perception. The “auditory chimera” experiment is conducted on both American English and Mandarin Chinese under the same conditions. Our experimental results show that there is no significant perceptual difference of TE and T...

The Relative Weight of Temporal Envelope Cues in Different Frequency Regions for Mandarin Sentence Recognition

Acoustic temporal envelope (E) cues containing speech information are distributed across the frequency spectrum. To investigate the relative weight of E cues in different frequency regions for Mandarin sentence recognition, E information was extracted from 30 contiguous bands across the range of 80-7,562 Hz using Hilbert decomposition and then allocated to five frequency regions. Recognition sc...

Journal:
  • The Journal of the Acoustical Society of America

Volume 104, Issue 1

Pages: -

Publication date: 1998